Assessing nocturnal scratch with actigraphy in atopic dermatitis patients

Nocturnal scratch is one major factor leading to impaired quality of life in atopic dermatitis (AD) patients. Therefore, objectively quantifying nocturnal scratch events aids in assessing the disease state, treatment effect, and AD patients’ quality of life. In this paper, we describe the use of actigraphy, highly predictive topological features, and a model-ensembling approach to develop an assessment of nocturnal scratch events by measuring scratch duration and intensity. Our assessment is tested in a clinical setting against the ground truth obtained from video recordings. The new approach addresses unmet challenges in existing studies, such as the lack of generalizability to real-world applications, the failure to capture finger scratches, and the limitations in the evaluation due to imbalanced data in the current literature. Furthermore, the performance evaluation shows agreement between derived digital endpoints and the video annotation ground truth, as well as patient-reported outcomes, which demonstrated the validity of the new assessment of nocturnal scratch.


Quaternion Approach
Accelerometers measure dynamics, e.g., acceleration, and angular velocity, with respect to the device orientation. For example, measurements are recorded with respect to the local (body) reference frame. So, as the device is rotated in the global reference frame, the acceleration due to gravity picked up by the accelerometer is generally not static with respect to the local frame. Thus if we want to remove gravity, we effectively need to know the device's orientation. Once this is known, we can trivially remove gravity by projecting it out. The problem is then reduced to determining the device orientation. This may be done by solving a differential equation for the orientation depending on the angular velocity (the instantaneous change in orientation measured by the gyroscope in the local frame). Our strategy will thus be as follows: (1) identify stationary and non-stationary regions of signal through e.g., standard deviation thresholds, (2) estimate the device orientation in the stationary regions, (4) use the estimated orientation from stationary regions as an initial value for the ODE used to find the orientation in the adjacent non-stationary regions, and (5) project out gravity.
We formulate our approach in terms of quaternions due to their nice theoretical properties. Quaternions constitute a representation of the 3D rotation group, SO(3), and may be used to encode these rotations more efficiently and stably than rotation matrices with Euler angles. Ostensibly quaternions are 4D vectors with a scalar and vector component whose components form a 4D associated normed division algebra over R. A rotation of angle θ about the unit vector n may be written as a quaternion: If we consider vectors in R 3 to be quaternions with zero scalar component, the rotation of a vector v by a quaternion R is given by The literature [1,12] derives the following ODE for quaternion evolution using the angular velocity in the local reference frame: It is important to realize that because quaternions do not commute, a closed-form solution via e.g., exponentiation is not readily possible. Then if g(t) is the gravity vector in the local frame, g(t) = R(t)g(0)R −1 (t) where the dynamics of R(t) are governed by Supplementary Equation 3 and g(0) is the initial orientation obtained by an adjacent stationary region where we have high confidence in the orientation (we believe the only contribution to acceleration is from gravity itself).

Mitigating Drift Error
While the above theory is, in principle, correct, it is plagued by errors from device precision and finite timestep size in solving the ODE (drift). If these are not addressed, the deviation of the estimated device orientation from the true orientation can be substantial, especially far from the stationary regions. So, to effectively remove gravity, we must add some error corrections. One easy way to do this is to estimate the gravity vector's orientation with rough estimates of error and fuse it with the predicted dynamic value from the ODE with the estimated drift error. Getting good error estimates can be tricky, but a heuristic approach appears sufficient. We get estimates of the gravity orientation by heavily filtering the acceleration component-wise with a low-pass Butterworth filter (f c = 0.4 Hz) and then normalizing at each timestep. We approximate the uncertainty of the estimate as u(t) = (||a|| − 1) 2 + 0.1||ω|| 2 where ω has units rad/s and acceleration, a, has units m/s 2 . Note that this is only a qualitative estimate with the property that it is small for regions near stationarity and large otherwise. The overall scale relative to the estimated drift error controls the Kalman gain of our fusion /textiti.e., the extent to which R(t) consists of a contribution from the provided estimate or the ODE. In order to further emphasize the role of stationary regions, we also pass the uncertainty, u(t), through a smoothed Heaviside filter: 1+e −x . Let R est (t) and R(t) be the quaternions that rotate g est (0) to g est (t) and g est (0) to g(t) respectively where g est (t) is the estimated gravity at each timestep, t, and g(t) is the dynamic gravity vector solved from our ODE and fused with the estimate. We define e(t) as our propagated error with e(0) = u(0) and e ODE to be the estimated error from the discretization of the ODE, then e(t+1) = e(t) 2 + e ODE . The Kalman gain [17] is g = where R mod is the quaternion from the ODE that would update the previous quaternion with the smallest rotation. The propagated error must then be updated to account for the effect of fusion via e(t) ← e(t) √ 1 − g. This gives the desired prescription for solving the ODE while leveraging regions near stationarity to mitigate drift.
Topological Data Analysis (TDA) is a rising field in the intersection of pure mathematics, statistics, and machine learning [4,6,10,11,23,24]. It concerns the "shape" of data in the form of Betti numbers, a classic subject in algebraic topology. Persistence diagrams, one of the main tools from TDA, have been proven effective and successful in many scientific disciplines (see survey article [16] for a list of applications). Recently, researchers have also found that TDA offers different aspects to analyze the time series data [7,13,14,[18][19][20]22]. In this work, we consider the sublevel set persistence diagram for a time series and extract features from it. Mathematical details about sub-level set filtration and persistence diagram can be found, e.g., in [5]. This Supplementary document illustrates these mathematical concepts by an example and describes the TDA features we used in the main document.
We first describe the definition of the sublevel set. Given a function f : R → R and a threshold t, the sublevel set of f at t is defined as  Figure S1(c)-(h), as the threshold value t increases, we observe that the sublevel set increases, too, i.e., more of f will be included as t becomes larger. It can be shown that f t1 ⊆ f t2 , for any t 1 ≤ t 2 , and this relation is called the sublevel set filtration. Tools in TDA concern tracking changes of Betti numbers in a filtration. Persistence diagrams of the sublevel set filtration are the media to store such information. We now describe the sublevel set persistence diagram for a function by the following example. As we have seen Supplementary Figure S1 Figure S1(h)). During the process, we focus on tracking how the sublevel set appears (is born) or disappears (dies). In Supplementary Figure S1(b), f 2 contains a single point depicted as blue color and labeled as Roman "I". In Supplementary Figure S1(c), we see that the region "I" grows and that a new region is born "II". In Supplementary Figure  S1(d), we see that both regions "I" and "II" grow, and we also observe that a new region is born "III". In Supplementary Figure S1(e), we see that regions "I", "II", and "III" grow, and we also observe that a new region is born "IV". In Supplementary Figure S1(f), we see that region "I" and "IV" merge, and since "IV" was born later than "I", we say that "IV" dies at threshold value 12. In Supplementary Figure S1(g), we see that region "II" and "III" merge and since "III" was born later than "II", we say that "III" dies at threshold value 13. Lastly, in Supplementary Figure S1(h), we see that region "I" and "II" merge, and since "II" was born later than "I", we say that "II" dies at threshold value 18. We observe that the region "I" actually never dies. By the convention in the TDA field [10], we say that the region "I" has the death value ∞. Therefore, the persistence diagram is D = {(9, 12), (7,13), (6,18), (2, ∞)}. Algorithms for computing persistence diagrams are well-studied. We use GUDHI [21] to compute persistence diagrams in this work.
utilize persistence diagrams to perform machine learning algorithms. However, directly applying machine learning algorithms on PDs is a challenging task due to lacking Hilbert space structure [3,15]. Summarizing or vectorizing the persistence diagram is a common approach to mitigate the difficulty and is currently an active research area in the field (see e.g., [2,9]). In this work, we consider persistence statistics [7] and Gaussian persistence curve [8]. For persistence statistics, we consider two sets of numbers: lifespan For each set of numbers M and L, we compute five basic summary statistics: the sample mean, standard deviation, skewness, kurtosis, and entropy. In addition to persistence statistics, we consider the norm of Gaussian persistence curves [8], which is defined below:  Figure S1 as an example. In practice, we treat ∞ as the largest value in the time series. Thus, in this example, "∞" will be replaced by 18, so D = {(9, 12), (7,13), (6,18), (2,18) Supplementary Figure S3: Plots of raw accelerometer and gyroscope signals overlapped with video ground truth and movement detection results. (a) The algorithm correctly identifies an annotated scratching event as movement (red and green shades overlapped). (b) Annotated scratching event with flat signal and was not identified as movement by the algorithm (no overlap between red and green shades). (c) Annotated scratching event with slight noise and was not identified as movement by the algorithm (no overlap between red and green shades.